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We study four citrate synthase homodimeric proteins within a structure-based coarse-grained model. Two of 
these proteins come from thermophilic bacteria, one from a cryophilic bacterium and one from a mesophilic 
organism; three are in the closed and two in the open conformations. Even though the proteins belong 
to the same fold, the model distinguishes the properties of these proteins in a way which is consistent with 
experiments. For instance, the thermophilic proteins are more stable thermodynamically than their mesophilic 
and cryophilic homologues, which we observe both in the magnitude of thermal fluctuations near the native 
state and in the kinetics of thermal unfolding. The level of stability correlates with the average coordination 
number for amino acids contacts and with the degree of structural compactness. The pattern of positional 
fluctuations along the sequence in the closed conformation is different than in the open conformation, including 
within the active site. The modes of correlated and anticorrelated movements of pairs of amino acids forming 
the active site are very different in the open and closed conformations. Taken together, our results show 
that the precise location of amino acid contacts in the native structure appears to be a critical element in 
explaining the similarities and differences in the thermodynamic properties, local flexibility and collective 
motions of the different forms of the enzyme. 
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I. INTRODUCTION 

The primary characteristics of globular proteins are 
their native structures and sequences of amino acids. An 
evolutionary divergence from a common ancestor may 
lead diverse sequences to fold to nearly the same native 
structure. One particularly interesting example is citrate 
synthase (CS), an enzyme that is found in most living 
organisms, from bacteria to mani. It acts as a part of the 
Krebs cycle that generates ATP—. The native structure 
of CS from a bacteria living in hot hydrothermal vents 
is superimposable with that of CS from a bacteria living 
under the conditions of extreme cold^ and yet properties 
of these proteins, such as the thermodynamic stability, 
are distinct so that the enzymatic action - conversion of 
acetyl-CoA and oxaloacetate into CoA and citrate^ - is 
executed properly. 

In this paper, we investigate to what extent structure- 
based coarse-grained models, or rather their one specific 
implementation, can capture physical differences between 
proteins which belong to the same fold. We study four 
CS proteins and demonstrate that their properties are in¬ 
deed distinct in the model due to the existence of slight 
differences in the native structures. In particular, the 
model thermophilic proteins are found to have stronger 
thermodynamic stability than those which are cryophilic 
(an equivalent term is: psychrophilic). However, we also 
find that the root-mean-square single-site fluctuations are 
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alike within the very region of the active site of the ther¬ 
mophilic and cryophilic forms, even though they are dis¬ 
tinct in other parts of the proteins. 

The structure-based dynamical models follow from 
Go’s idea4d— that kinetic and thermodynamic properties 
of proteins should depend primarily on the geometry of 
the native structure and less so directly on the specificity 
of the sequence. There are many variants of such models 
(see, for instance. Refs— “— ). They are not equivalent, 
but their predictions are expected to be most accurate in 
a vicinity of the native state - a situation encountered, 
for instance, during stretching manipulationsi^r— . It has 
also been argued that protein folding processes may be 
proceeding as in the Go-like models due to minimiza¬ 
tion of hindrances to foldingi2r— . Large-scale conforma¬ 
tional changes in proteins can be explored by multi-state 
Go-type models^i^i and mixed elastic network models2^. 
In contrast, standard elastic network models, which are 
less demanding computationally than molecular dynam¬ 
ics simulations, are known to capture local fluctuations 
about the native state^S. Here, we employ a standard 
Go-type model to examine thermal stability, local flex¬ 
ibility, collective motions and unfolding kinetics of GS 
enzymes functioning in organisms that live in very differ¬ 
ent environmental conditions. 

The adaptations of life to different environmental con¬ 
ditions can be observed at various levels of organiza¬ 
tion, including the molecular level. The composition and 
structure of proteins from organisms living under extreme 
conditions^Sr— are known to correlate with the character 
of the environment. For instance, thermophilic organ¬ 
isms, i.e. those thriving between 45° G and 120° G, tend 
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to contain proteins with smaller cavities, bigger num¬ 
ber of ionic bonds, increased polarity of exposed sur¬ 
faces, an increased content of charged residues and tryp¬ 
tophan, and a smaller content of phenylalanine, methio¬ 
nine and asparagine compared to proteins in mesophilic 
organisms^—. On the other hand, cryophilic organ¬ 
isms, i.e. those which grow and reproduce between about 
-20° C to -1-10° C, tend to contain proteins with larger 
catalytic cavities, reduced content of proline and arginine 
(to make the backbone more flexible), increased content 
of clusters of glycines, less hydrophobic cores (to make 
the protein less compact), a higher proportion of non¬ 
polar groups on the surface, and an increased content of 
negative charges on the surface (to facilitate interactions 
with the solventStructure-based models cannot 
explicitly take all of these many detailed chemical fea¬ 
tures into account, but these features lead to a particu¬ 
lar structure of the native conformation. Therefore, such 
models can tell the thermophilic and cryophilic proteins 
apart even if they belong to the same fold. We shall in¬ 
quire here - to what extent. These models also identify 
properties that are similar. We illustrate these aspects 
here by considering CS. 

CS is an a-protein homodimer. Fig. [T] shows an ex¬ 
ample of the native conformation - it is for CS from 
Pyrococcus furiosus with the Protein Data Bank (PDB) 
code 1AJ8 and atomic coordinates resolved for 741 amino 
acids^. It is formed of two identical subunits, each with 
its own active site. Each of the subunits comprises two 
domains - a small domain, which comprises five a-helices, 
and and a large domain which contains 13 a-helices. In 
other species, the number of the helices in the large do¬ 
main varies between 11 and 15. The substrate binding 
site is located in a cleft between the two domains. 

Two kinds of conformations of CS have been identi¬ 
fied: one is termed ’open’ and the other ’closed’. CS is 
in the ’open’ conformation when the active site is not 
occupied and is thus free to bind substrate molecules for 
the enzyme catalysis. When the substrate binds, the 
small domain undergoes a rotation, sealing the substrate 
binding site and forming the ’closed’ conformation. A 
number of crystal structures of CS from different organ¬ 
isms are available in the PDB. The five structures we 
have considered in our study are summarized in Table ID 
It would be ideal to also consider an open form of CS 
form a cryophilic organism, but they do not seem to be 
available in the PDB. 


II. METHODS 


We use a coarse-grained continuum representation for 
CS conhgurations in which only the positions of C^ 
atoms are retained in the molecular dynamics simula¬ 
tions. However, the contact map, i.e. a list of non- 
bonded interactions is determined based on the all-atom 
coordinates. The configurational energy of the model 



FIG. 1. Structure of citrate synthase homodimer from Pyro¬ 
coccus furiosus in the closed conformation (PDB code 1AJ8). 
The reaction products (citrate and CoA) are shown in the 
stick representation. The small and large domain in chain 
A (first monomer) are shown in blue and green, respectively, 
whereas those in chain B (second monomer) in gray and or¬ 
ange. 


PDB code 
and 

reference 

confor¬ 

mation 

organism 

optimal 

growth 

temperature 

lAJSi^S 

closed 

Pyrococcus furiosus 

100° C 

2CTS^ 

closed 

Sus scrofa (pig) 

37° C 

1A59^ 

closed 

Antarctic bacterium 

0° C 

2IBP^ 

Open 

Pyrobaculum aerophilum 

100° C 

ICTS^ 

open 

Sus scrofa (pig) 

37° C 


TABLE I. Structures of CS from different organisms used in 
this study. Structures of the closed conformation contain the 
reaction products (CoA and citrate). The products are absent 
in the open conformation structures. 
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Here, r and rg denote the distances between two subse¬ 
quent residues at configurations X and Ag, respectively. 
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where the reference configuration Xq corresponds to the 
native state. The summation over bonds includes peptide 
and disulfide bonds (the latter are present in 2IBP, one 
in each monomer). Analogously, </> and (jjQ represent the 
bond angles formed by three subsequent residues along 
the amino acid chain at configurations X and Xq, respec¬ 
tively. Next, 9 and 9q are the dihedral angles formed by 
four sequential residues at configurations X and Xq . The 
coupling parameters in the first three terms in Eq. (HD are 
taken to be as in Ref4, i.e., = 100 e/A^, Kg = 20 e, 

ifW = e, and = e/2. 

The fourth term in Eq. m describes the native in¬ 
teractions. Here, and roij denote the distances at 
configurations X and Xq, respectively, between residues 
i and j forming the native contacts. The native contacts 
are identified as in RefJ^d^i^, i.e., by using an overlap 
criterion^ applied to the coordinates of all heavy atoms 
in the native structure. The heavy atoms are assigned 
enlarged van der Waals spheres and if there is a pair of 
atoms for which one finds spheres that overlap in the na¬ 
tive state then the corresponding pair of amino acids is 
considered as forming a native contact. Physically, these 
contacts are due to hydrogen bonds and ionic bridges. 
Only contacts with |i — / | > 3 are included in the con¬ 
tact map. The last term in Eq. (HD describes repulsion 
between non-native contacts, with po = A A. 

The model described by Eq. HD is different from the 
one used in our recent studies of protein stretchingi^^— : 
the contact potential is of the 10-12 type, and not 6-12, 
and the local backbone stiffness is given by the usual 
bond-angle and dihedral-angle potentials instead of a 
term involving the local chirality (which effectively in¬ 
cludes the dihedral term but not the bond angle terms^i) . 
The 6-12 model yields amino-acid positional fluctua¬ 
tions substantially larger than those found in experiments 
through the temperature factors, and coming close to the 
experimental data requires rescaling as in Ref4^. Never¬ 
theless, we expect that the calibration of e is of the same 
order, i.e. e « 110 pN A. Thus the room temperature 
should be in the vicinity of 0.35 e/ks, where ks denotes 
the Boltzmann constant. 

In our coarse-grained simulations, the citrate and CoA 
molecules (present in the closed structures of CS) are rep¬ 
resented by one and four beads, respectively. The con¬ 
tacts between CS and citrate, and between CS and CoA, 
are found using the same all-atom overlap criterion as 
the native contacts between the amino acids within CS. 
To avoid dissociation of the citrate and CoA molecules 
from CS during simulations, these contacts are replaced 
by harmonic bonds analogous to the C^-Ca pseudobonds 
with the spring constant Kr = 100 e/A^. 

To study thermodynamic properties of the CS dimers, 
we performed overdamped Langevin dynamics simula¬ 
tions using in-house softwarei^*^. The simulations were 
carried out at 31 different temperatures T distributed 
uniformly in the interval from O.le/fcs to O.Te/fcs. Each 
simulation was 10® r long and it was preceded by a 10^ r 
equilibration run. The unit of time, r, is of order 1 ns. At 


these temperatures and time scales the dimers never dis¬ 
sociate or unfold. We observe only small and moderate 
deviations from the native state as measured by RMSD, 
see Figs.lSb and[3J) in the next section. 

In the course of the simulations, we monitor the num¬ 
ber of contacts 


M 

( 2 ) 

i=l 

where M is the total number of contacts in the native 
state, Tijit) is the distance between residues i and j at 
simulation time t, dij = 1.2 is a cutoff distance, and 
9 is the the Heaviside function; 9{x) = 1 if x > 0 and 
9{x) = 0 if X < 0. To assess thermal stability of the en¬ 
zyme, we compute the probability of finding the enzyme 
in the native state 


^0 — 


( 3 ) 


as a function of temperature T, where the brackets de¬ 
note time average after equilibration, and Sm,M is the 
Kronecker delta: Sm,M = 1 if m = M and Sm,M = 0 oth¬ 
erwise. Notice that the definition of Pq involves counting 
conformations in which all native contacts are present. 
The native state probability as given by Eq. m is thus 
different from the average fraction of contacts 




( 4 ) 


We define temperature of thermodynamical stability, Tf, 
as one at which Pq = ^. This temperature is different 
and substantially lower than the temperature at which 
Q = ^. Our definition of T/ yields values of T that are 
in a vicinity of temperatures at which smaller proteins 
fold optimality^. 

To describe how far a particular configuration has de¬ 
parted from the native state, we compute the root mean 
square deviation 


RMSD(t) 



( 5 ) 


where denote the positions of Ca atoms in the na¬ 

tive state and fl are positions of the Ca atoms at time t 
after superimposing on the native structure. We use the 
Kabsch algorithm^ to superimpose the instantaneous 
structures on the native structure. After equilibration, 
RMSD fluctuates around its average value, (RMSD), 
which is a function of temperate T. 

To quantify the local flexibility of the enzyme, we com¬ 
pute the root mean square fluctuation (RMSF) of each 
residue 


A = 



( 6 ) 


Here fl is the position of the f-th Cq, atom, and the angle 
brackets denote the average after superimposing on the 
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native structure. The RMSFs are directly related to the 
so called temperature factors^S 


Pi = 



( 7 ) 


which can be determined in biomolecular crystallography 
experiments and are listed in PDB files. 

The correlation in the motions of residues i and j are 
given by 




{6ri ■ 5fj) 




( 8 ) 


where Sfl = fl — (fl) is the displacement from the av¬ 
erage position computed after superimposing on the na¬ 
tive structure. Significant positive correlations mean that 
the residues tend to move together, possibly as part of a 
larger structural motif. Negative correlations imply that 
the two residues tend to move in opposite directions. 

To study thermal unfolding and dissociation of the CS 
dimers, we performed overdamped Langevin dynamics 
simulations at elevated temperatures T between 1.1 e/ks 
and 1.7 e/kB- At each of temperatures we run 200 trajec¬ 
tories to study kinetics of the unfolding and dissociation 
processes. 


III. RESULTS 

A. Thermodynamic properties of CS dimers 



FIG. 2. Thermodynamic properties of citrate synthase from 
thermophilic (1AJ8, dashed lines), mesophilic (2CTS, dot¬ 
ted lines) and cryophilic (1A59, thin solid lines) organisms in 
the closed conformation, (a) The probability of finding the 
enzyme in the native state, Po, (b) the average root mean 
square deviation (RMSD) from the native state, (c) the aver¬ 
age internal energy per residue, and (d) the average radius of 
gyration, (Rg), as functions of temperature T. 


We first compare some thermodynamic properties of 
the CS dimers from thermophilic and cryophilic organ¬ 
isms in the closed conformation (PDB codes 1AJ8 and 


1A59, respectively). Fig. [2^ shows that, at any a given 
T, the probability Pq of finding the enzyme in the na¬ 
tive state is larger for the thermophilic CS than for 
the cryophilic one. In particular, Tf is 0.303 e/fc^ and 
0.287 e/ks for the thermophilic and cryophilic CS dimers, 
respectively. With our calibration of e, the difference is 
of order 14° C. 

Fig. [2]3 shows that, at any given T, (RMSD) is larger 
for the cryophilic CS than for the thermophilic one. Thus 
the cryophilic CS undergoes larger thermal fluctuations 
than the thermophilic CS at the same T. The internal en¬ 
ergy per residue, shown in Fig[2t, is lower by about 0.2 e 
for the thermophilic CS compared to the cryophilic CS, 
independent of the value of T. This result is consistent 
with the data in Figl2^. showing that the thermophilic CS 
exhibits better thermal stability than the thermophilic 
CS. 

Interestingly, the thermophilic CS appears slightly 
more compact than the cryophilic CS, as evidenced by the 
time-averaged radius of gyration, {Rg), plotted against 
T, see Fig.[2}l (note that these two dimers comprise com¬ 
parable numbers of residues, 740 and 754, respectively). 
The difference in {Rg) is about 0.4 A or about 1.5% of 
the native Rg. We have checked that, at a given tem¬ 
perature T in the range between 0.1 e/ks and 0.7 e/ks, 
the standard deviation of the instantaneous Rg values 
is smaller than 0.1 A, both for the cryophilic and ther¬ 
mophilic forms of CS. Thus the observed difference in the 
values of {Rg) is small but statistically relevant. 

We next discuss the thermodynamic properties of the 
mesophilic CS in the closed conformation (with the PDB 
code 2CTS). We find that, at any given temperature, Pq 
for the mesophilic CS is smaller than for the thermophilic 
one and only slightly bigger than for the cryophilic CS, 
see Fig. [2^. Its stability temperature is Tf = 0.289 e/ks- 
The stability difference, as measured by alterations in 
Tf, between the thermophilic and mesophilic CS dimers 
in the closed conformation is of order of 12° C. 

Interestingly, out of the three CS dimers in the closed 
conformation, the mesophilic one shows the fastest in¬ 
crease in (RMSD) with T, see Fig. [2 )d. At any given T, 
the internal energy per residue for the mesophilic and 
cryophilic CS dimers in the closed conformation are al¬ 
most the same, see Fig. [2t. This observation is con¬ 
sistent with the results shown in Figl^, namely, that 
these two forms are characterized by very similar Po{T) 
curves. The mesophilic CS dimer comprises 874 amino 
acid residues whereas its cryophilic and thermophilic ho- 
mologues - only 754 and 740 residues, respectively. This 
explains why the mesophilic form has the larges radius 
of gyration, as shown in Fig. [2Jl. 

We next compare the thermodynamic properties of CS 
dimers from thermophilic and mesophilic organisms in 
the open conformation (PDB codes 2IBP and ICTS, 
respectively). As shown in Fig. [3^, the thermophilic 
CS exhibits larger Po{T) and is thus thermodynamically 
more stable than the mesophilic CS. The values of T/ 
are 0.285 e/fcs and 0.274 e/fcs for the thermophilic and 
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FIG. 3. Similar to Fig. [2] but for citrate synthase from ther¬ 
mophilic (2IBP) and mesophilic (ICTS) organisms in the 
open conformation. 


mesophilic CS, respectively. The difference is of order 
10° C. Fig. [3 Jd shows that the mesophilic CS exhibits 
larger (RMSD)(T) and is thus more susceptible for ther¬ 
mal fluctuations than the thermophilic CS. A larger ther¬ 
mal stability of the thermophilic CS can be also inferred 
from Fig. [3t which shows the internal energy per residue 
as a function of T. We note also that the thermophilic 
CS is more compact than than the mesophilic CS, see 
Fig. I3J1. Overall, we observe that the qualitative differ¬ 
ences between the proteins belonging to the organisms 
preferring high and low temperatures do not depend on 
whether the structures are closed or open. 

It is instructive to compare the open and closed CSs 
from the same organism (with PDB codes ICTS and 
2CTS, respectively). We note that the closed form is 
more compact than the open one. Its radius of gyra¬ 
tion is smaller by 1.2 A. The closed form exhibits a 
slower decrease of Pq with T. As mentioned before, Tf 
is 0.289 e/ks for the closed form and 0.274 e/ks for the 
open one. Since the two forms have identical sequences, 
the difference in Tf should be related to the number of 
native contacts, which is 2060 in the open conformation 
and 2138 in the closed conformation. 

The higher thermodynamic stability of the ther¬ 
mophilic proteins is expected and required by the func¬ 
tion. It is interesting to ask, however, what features of 
the native structure encode this information. It has been 
shown recently, in a similar model, that effective stiffness 
of virus capsids, as measured in nanoindentation experi¬ 
ments, correlates with average coordination number, 
Specihcally, the corresponding Young modulus is propor¬ 
tional to (z — 6)^. The average coordination number is 
defined as 

„ #native contacts -I- #bonds , , 

‘ ^ 

and it describes how many neighbors, on an average, any 
residue has. It is determined in the native state and the 
notion of neighborhood relates to the dynamics of the 


system. Table HIl demonstrates that the values of T/ cor¬ 
relate with z both in the open and closed conformation. 
The coordination number can be determined both for 
the dimers and individual monomers. The correspond¬ 
ing quantities Zdimer and Zmonomer differ because there 
are contacts formed at the interface of the two monomers. 
The number of the interfacial contacts in 1AJ8 is 289; in 
2CTS it is 302, whereas in 1A59 it is only 266. For the 
closed forms the numbers are 322 for 2IBP and 252 for 
ICTS. Whichever quantity one takes, Zdimer or Zmonomen 
the correlation with Tf is evident, which indicates that 
both bulk and interfacial contacts contribute to the larger 
stability of the thermophilic forms. The larger value of z 
is also consistent with the tighter packing of residues in 
thermophilic enzymes. 

It is interesting to ask whether the stability tempera¬ 
ture is correlated with the average contact order 

= ( 10 ) 

d 

Here, N is the number of residues in an amino acid chain, 
where M is the total number of the native contacts, and 
Aij = \i— j\ is the sequence separation between residues 
i and j that form a native contact. Note that the contact 
order is uniquely defined only for single amino acid chains 
(monomers). The values of the average contact order as 
calculated from the contact maps of the five CS forms are 
given in the last column of Table HIl Interestingly, we find 
no correlation between CO and Tf. A similar conclusion 
has been reached in Ref J2,. 


B. The temperature factors and RMSF 

To validate our simulations, we compare experimen¬ 
tal and simulational temperature factors of the differ¬ 
ent CSs, see Fig. S) The simulations have been done at 
T = 0.3e/kB which is close to Tf. Although the tem¬ 
perature factors in a crystal may in general differ from 
those in a solution, this approach has been widely used 
to parametrize elastic network models^^. 

We obtain very good agreement for the 2IBP and IA59 
structures, see panels (a) and (b) in Fig. S] For IAJ8, 
see Fig. lUJc), the simulation pattern closely resembles 
the distribution of the crystallographic /? factors but its 
magnitude is too small. However, simulations at T = 
O.de/fcs fit the experiment much better, see the blue line 
in FigllKc). We do not compare results for ICTS because 
the resolution of the mesophilic CS structure in the open 
conformation (ICTS) is only 2.7 A. The resolution for 
the structures of 2IBP, IA59, and IAJ8 is much better 
and is equal to 1.6 A, 2.1 A, and 1.9 A respectively. 

We note that the average crystallographic /3 factor, 
^ ^ A; is ,9 = 21 A^ for the thermophilic CS in 

the closed conformation (1AJ8), and ,9 = 14 A^ for the 
cryophilic CS in the closed conformation (1A59). Impor¬ 
tantly, in both cases the protein crystals have been re- 
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organism type 

conformation 

PDB code 

ksTfjt 

■^•dimer 

■^monomer 

CO 

thermophilic 

closed 

1AJ8 

0.303 

7.31 

6.54 

0.0859 

mesophilic 

closed 

2CTS 

0.289 

7.01 

6.33 

0.0784 

cryophilic 

closed 

1A59 

0.287 

6.99 

6.29 

0.0876 

thermophilic 

Open 

2IBP 

0.285 

7.04 

6.25 

0.0846 

mesophilic 

open 

ICTS 

0.274 

6.71 

6.14 

0.0811 


TABLE II. In both the open and closed conformation, Tf correlates with the average coordination number both for CS dimers 

(Zdimer) and monomers (^monomer). 
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FIG. 4. Temperature factors /3i versus the residue number 
i from X-ray crystallography experiments (see the solid lines 
in red) and results of simulations at T = O.Se/fcs (see the 
dashed lines in black). The PDB codes of the different en¬ 
zymes are given in figure panels. The vertical lines indicate 
the residues that form the active site as listed in Table SI in 
Supplementary Material^. The thick horizontal lines show 
the location of the small domain. The blue dashed line in 
panel (c) shows the result of simulations at T = QAe/ks- 
Panel (d) shows RMSF along the sequence, 5i, computed at 
T = Tf for 1AJ8 {Tf = 0.303 ejkB', see the dashed line in 
magenta) and 1A59 {Tf = 0.287 e/ks', see the dotted line in 
blue). 


ported to be obtained at the room temperature. The ex¬ 
perimental data would thus imply that the thermophilic 
CS is, on average, more flexible than its cryophilic ho- 
mologue at the same temperature. One would certainly 
expect the opposite relation. This inconsistency might be 
a reason for the quantitative disagreement between simu¬ 
lations and experiment in the case of 1AJ8, see Fig.|4l(c). 

In all of the analyzed structures, the residues that 
have the largest [3 factors are localized in the small do¬ 
main (the region highlighted in Fig. 2]), primarily in the 


segments that bind the CoA molecule. In contrast, the 
residues that bind citrate (indicated by the vertical lines 
in Fig. |4]) have relatively small temperature factors. 

When one compares the (3 factors of the thermophilic 
and cryophilic proteins in the closed conformation, see 
panels (b) and (c) of Fig. 01 then they are seen to map 
out very similar looking patterns no matter whether one 
uses the experimental results at the room temperature 
(red solid lines in Fig. 0]) or the simulational results at 
T = 0.3 e/ks (black dashed lines in Fig. 0]). When 5i 
or Pi are calculated at a higher T for the thermophilic 
form and at a lower T for the cryophilic form then the 
match in the patterns is better quantitatively. Fig. 01d) 
compares 5i for 1AJ8 and IA59 at their respective sta¬ 
bility temperatures. The amplitudes of the fluctuations 
are seen to be comparable. 


C. Fluctuations of the active site 

The citrate binding site comprises seven evolutionar- 
ily conserved amino acids: three histidine residues, three 
arginine residues and one aspartic aci d^^d^d^ . They are 
depicted in Fig. 0] and summarized in Table SI which is 
provided in Supplementary Material^. Three of them, 
which we denote here as His 2 , Hisa and Aspi, are directly 
involved in the chemical reaction that results in citrate 
formation^. They are highlighted in Table SI— in bold. 
Hisa, Argi and Apsi reside in the small domain. Hisi, 
His 2 , Arg 2 and Arga are located in the large domain (His 2 
is at the N-terminal end of the large domain). Arga is 
the only citrate-binding residue that resides on the other 
monomer. 

In Fig. 0] the citrate binding residues are labeled with 
vertical lines. All of them exhibit relatively small spatial 
fluctuations, as indicated by low /? factors. Fig. 0] shows 
the RMSFs of these seven amino acid residues as a func¬ 
tion of T. In the closed conformation, the active site in 
the thermophilic and cryophilic enzymes fluctuate in the 
same manner although, for any fixed T, the magnitude of 
fluctuations is somewhat larger in the cryophilic CS, see 
panels c and d in Fig. 01 The residues that exhibit largest 
RMSFs are Argi, Arga and Hisi. The residues that fluc¬ 
tuate most weakly are His 2 , Aspi and Hisa, which happen 
to be the three amino acids that are directly involved in 
the chemical reaction of citrate formation. 
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FIG. 5. Structures of the citrate binding site in the open 
(2IBP, left) and closed (1AJ8, right) conformations. The 
seven evolutionarily conserved residues that are involved in 
binding the citrate molecule are shown as purple spheres. The 
citrate and CoA molecules are shown in the stick representa¬ 
tion. The color code in the same as in Fig. [T] 


manner but quite differently than in the closed confor¬ 
mation, see panels a and b in Fig [6l Here, the residues 
that fluctuate the most are Hisa and Aspi, which are in¬ 
volved in the reaction of citrate formation. The residues 
that exhibit smallest fluctuations are Arg 2 and Hisi. Also 
the magnitude of RMSFs is larger than in the closed con¬ 
formation. 

Interestingly, the three residues involved directly in the 
catalytic activity, Hisa, His 2 and Aspi, exhibit rather 
small thermal fluctuations in the closed conformation, 
whereas in the open conformation the thermal fluctua¬ 
tions of these three residues are significantly enhanced. 
This result seems consistent with the division of functions 
between residues in the active site: some amino acids are 
tailored to binding the substrate, whereas others partic¬ 
ipate in the catalytic process. 

D. Collective motions of the active site 
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FIG. 6. RMSF Si of the seven residues forming the active site 
as a function of T. 


thermophilic open (2IBP) mesophilic open (ICTS) 



FIG. 7. Equal-time correlation coefficients (upper panels) 
and variation of distances (lower panels) between the seven 
residues forming the active site in the open conformation. 
The results were obtained in simulations of the thermophilic 
(left panels) and mesophilic (right panels) CS dimers at 
T = 0.3 ejkB- The largest positive correlations and the small¬ 
est distance variations are shown as red circles. The negative 
correlations and the largest distance variations are shown as 
blue squares. The intermediate cases are indicated by black 
crosses. 


Our results at temperatures between 0.35 and 
0.45 e/kB agree qualitatively with the results of exten¬ 
sive all-atom molecular dynamics simulations^^, which 
show that at r = 300 K the RMSFs of the active site 
residues in contact with the citrate molecule are in the 
range between 0.4 and 0.6 A, depending on the organ¬ 
ism (thermophilic, mesophilic or cryophilic) and bound¬ 
ary conditions (periodic or spherical). 

In the open conformation, the active site in the ther¬ 
mophilic and mesophilic enzymes fluctuate in a similar 


In order to identify collective motions of the active site 
residues, we compute the equal-time correlation coeffi¬ 
cients Cij as defined in Eq. ([5]) for all pairs {i,j) of the 
residues forming the active site. The upper panels of 
Fig. [7] show correlation levels of these pairs in the ther¬ 
mophilic (left panel) and mesophilic (right panel) en¬ 
zymes in the open conformation. The positive correla¬ 
tions with Cij > 0.15 are labeled with red circles and 
indicate that the pairs of residues often move together 
in the same direction. The negative correlations are de¬ 
picted as blue squares and indicate that the pairs have a 
tendency to move simultaneously in opposite directions. 




















The results shown in Fig. [7] were obtained in simula¬ 
tions at temperature T = 0.3 e/k^ but we checked that 
the same correlation pattern persists at temperatures be¬ 
tween 0.2 e/ks and 0.5 e/ks- 

Interestingly, all the active site residues that are lo¬ 
cated in the small domain, Hisa, Argi and Aspi, appear 
to move together as a unit, which is indicated by signif¬ 
icant positive correlations for all the three residue pairs. 
The three residues in the small domain exhibit motions 
that are anti-correlated with the motions of Hisa and 
Arga, which are placed in the large domain. Therefore, 
the collective motions of the active site in the open con¬ 
formation seem to be due to the displacements of the 
small domain relative to the large domain. 

To further quantify this observation we compute the 
variations of the inter-residue distances, Sdij = {{dij — 
{dij)Y)^/‘^, where {dij) is the average distance between 
residues i and j forming the active site. The lower panels 
of Fig. [7] illustrate the distance variations in the open con¬ 
formation of the thermophilic (left panel) and mesophilic 
(right panel) enzymes. The smallest distance variations 
(red circles) are observed for the residue pairs that exhibit 
correlated motions with Ctj > 0.15. The largest distance 
variations (blue squares) are observed almost exclusively 
for the residue pairs that exhibit negative spatial corre¬ 
lations and, thus, have the propensity to move in oppo¬ 
site directions. Importantly, these results show that the 
motions of largest amplitudes occur primarily between 
the residues located in different domains, and the small- 
amplitude motions are mainly between the residues re¬ 
siding in the same domains. 

In the open conformation of the thermophilic CS, 
we can distinguish three groups of spatially-correlated 
residues: the first group comprises the three residues in 
the small domain, Argi, Hisa and Aspi; the second group 
consists of His 1 and Arg 2 ; and the third group is His 2 and 
Arga. Within each of the groups, the individual residues 
show a propensity to move jointly with other group mem¬ 
bers. For example, Hisi and Arg 2 often move together in 
the same direction (C^ = 0.26) but almost independently 
from other residues of the active site. Interestingly, all 
the residues in the first group are spatially anti-correlated 
with the residues in the third group. These two groups 
can be thus visualized as the lower and upper ’jaws’ mov¬ 
ing in an up-and-down manner. 

In the open conformation of the mesophilic CS, we ob¬ 
serve more correlations between the active site residues 
present in the large domain. Here, the ’upper jaw’ (His 2 
and Arga) appears to be connected more stiffly to the 
’hinge’ (Hisi and Arg 2 ) but performs even larger mo¬ 
tions {Sdij > 0.2 A) against the ’lower jaw’ (Argi, Hisg 
and Aspi). 

In the closed conformation, both in the thermophilic 
and cryophilic enzymes, we observe only positive corre¬ 
lations, Cij > 0, and rather small inter-residue distance 
variations, Sdij < 0.07 A, see Fig. SI in Supplementary 
Material^. Here, due to the presence of the citrate and 
Co A molecules, the small domain is tightly bound to the 


large domain, and the active site residues have no free¬ 
dom to perform any large-amplitude motions. Therefore, 
the residues in the small domain are positively correlated 
with those in the large domain. Interestingly, the active 
sites in the thermophilic and cryophilic CS exhibit the 
same pattern of correlations, see the top panels in Fig. SI. 


E. Kinetics of thermal unfolding and dissociation 


Another way to assess thermal stability of proteins is 
to simulate their unfolding at elevated temperatures as 
analyzed theoretically in Refi^. The unfolding simula¬ 
tions start at the native state and finish when all nonlo¬ 
cal contacts get broken, which defines the unfolding time 
^unf- Specifically, the nonlocality refers to the sequential 
distance \i — j\ > 1. We take I = 10 in this study. In 
Refi^, I has been taken to be 4 to eliminate the local 
contacts in a-helices, but this choice is too demanding 
computationally in the current context. 

We performed unfolding simulations at high tempera¬ 
tures only for the CS dimers in the open conformation be¬ 
cause dissociation of the citrate and CoA molecules from 
CS occurs much faster than protein unfolding or dimer 
dissociation (kinetic constants for the successive steps of 
the CS reaction at physiological conditions are detailed, 
for example, in Refi^). At any given T between 1.1 e/ks 
and 1.7 e/fcs, we run 200 trajectories. We checked that 
tunf < 25000 T in all the simulation trajectories. 



FIG. 8. Thermal unfolding of the thermophilic (2IBP) and 
mesophilic (ICTS) CS in the open conformation. The panels 
on the left hand side show histograms of the unfolding time, 
tunf, at temperatures T = 1.25 e/fcs (2IBP, upper-left panel) 
and T = 1.15 e/ks (ICTS, lower-left panel). The panel on 
the right hand side shows the average unfolding time as a 
function of temperature. 


The histograms of tunf for the thermophilic (2IBP) 
and mesophilic (ICTS) CS in the open conformation are 
shown in the left-hand-side panels of Fig. |8l These his¬ 
tograms demonstrate that the thermophilic CS unfolds 
slower at T = 1.25 e/ks than the mesophilic CS at 
T = 1.15 e/ks, which means that the thermophilic en- 
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zyme is more resistant to thermal denaturation. Also, 
the histograms seem to display multi-peak profiles, which 
suggests that there might be more than one characteris¬ 
tic time scale governing the thermal unfolding of the CS 
dimers. To further characterize the unfolding kinetics, it 
is instructive to calculate the fraction of the number of 
trajectories, <^/(t), in which the unfolding event has not 
occurred within time t from the beginning of the simu¬ 
lation. The fraction 4’f{t) provides an estimate for the 
probability of not unfolding CS within time t. As can be 
seen in Fig. S3^, there is an initial lag phase in which 
4>f{t) = 1. The lag phase is followed by a fast decrease 
of </>/(t). Interestinlgy, we find that the decay oft/)/ with 
t is not exponential. 

The right-hand-side panel of Fig. [5] shows the average 
unfolding time, (tunf), as a function of T. The average is 
taken over 200 trajectories. The error bars correspond to 
the standard error of the mean. At any specified temper¬ 
ature, the thermophilic CS (2IBP) exhibits significantly 
larger unfolding time than the mesophilic CS (ICTS). 
This result shows that the thermophilic enzyme is ther¬ 
mally more stable than its mesophilic homologue. Based 
on the dependence of (tunf) on T as shown in Fig. |8l 
the difference in stability temperatures is of the order 
0.1 e/fcs, which corresponds to about 80° C. 

We use an analogous method to quantify thermal dis¬ 
sociation of CS dimers. The dynamics of the system is 
exactly the same as in the unfolding simulations but only 
the contacts between the monomers are monitored. The 
simulations start at the native state and finish when all 
contacts between the monomers get broken, which defines 
the unbinding time tunb- In the range of temperatures 
studied, tunb < 25000 r in all the trajectories. 
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FIG. 9. Thermal dissociation of the thermophilic (2IBP) and 
mesophilic (ICTS) CS dimers in the open conformation. The 
left-hand-side panels show histograms of the unbinding times 
at T = 1.11 e/fcfl (2IBP) and T = 1.18 e/fcs (ICTS). The 
right-hand-side panel shows (tunf) as a function of T. 


The left panels of Fig. show histograms of the un¬ 
binding time for the thermophilic (2IBP) and mesophilic 
(ICTS) CS in the open conformation at T = 1.11 e/fcs 
and T = 1.18 e/fcs, respectively. These data show that 


the thermophilic dimer is more resistant to thermal dis¬ 
sociation. To characterize the dissociation kinetics, we 
compute the fraction of the number of trajectories, 
in which the unbinding event has not occurred within 
time t. Fig. S3^ shows (/)b(t) for 2IBP and ICTS at dif¬ 
ferent temperatures. We note that the decay of </>{, with 
t is not exponential. 

The right panel of Fig. [9] shows the average unbind¬ 
ing time, (tunb), as a function of T. The average is taken 
over 200 trajectories, and the error bars correspond to the 
standard error of the mean. At any given temperature, 
{tunb) of the thermophilic CS dimer (2IBP) is signifi¬ 
cantly larger than that of the mesophilic enzyme (ICTS). 
Interestingly, the two plots of (tunb) versus T overlap 
if the temperature is shifted by about 0.07 e/ks- This 
temperature shift corresponds roughly to 60° C, which 
is consistent with the difference in the optimal growth 
temperatures, see Table [H 


IV. SUMMARY 

We performed Langevin dynamics simulations of CS 
dimers from thermophilic, mesophilic and cryophilic or¬ 
ganisms using a coarse-grained structure-based model 
that does not differentiate between the ionic, hydropho¬ 
bic or van der Waals interactions. In fact, the interac¬ 
tions between different amino acid residues are described 
in this model by the same potential energy function. Al¬ 
though this model might seem oversimplified, it correctly 
predicts that the thermophilic CS is thermally more sta¬ 
ble than the mesophilic and cryophilic ones. It also yields 
root-mean-square fluctuates of single amino acid residues 
that are fully consistent with crystallographic tempera¬ 
ture factors. Therefore, the precise location of the amino 
acid contacts appears to be a key element in explain¬ 
ing thermodynamic properties and local flexibility of en¬ 
zymes. 

The difference of the thermodynamical stability tem¬ 
peratures, AT/, can be obtained from the native state 
probabilities (see section UlI All . For the CS dimers from 
thermophilic and cryophilic organisms in the closed con¬ 
formation we obtain AT/ ss 14° C; for the thermophilic 
and mesophilic CS dimers in the open conformation we 
get AT/ « 10° C. These values are smaller than ex¬ 
pected. The expected difference in melting temperatures 
between the mesophilic and thermophilic CS dimers is 
about 20° C as reported in a recent study^. The likely 
reason for the fact that our model underestimates the 
transition temperature differences is that it lacks some 
relevant sequence effects. As discussed in section |IT1 the 
atomic structures of the CS proteins are used in the Go- 
type model only to construct the contact maps whereas 
the chemical composition of the proteins is not directly 
included in the energy function of the model as given by 
Eq. ID). 

The temperature of optimal folding, T/, is one mea¬ 
sure of the thermodynamic stability of proteins. Another 
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measure is provided by the kinetics of thermal unfolding 
(see section ITlI El) . These two measures are quite differ¬ 
ent as they refer to equilibrium properties near the native 
state and dynamic properties away from the native state, 
respectively. One measures the frequency of the situa¬ 
tions in which all contacts are present simultaneously, 
and the other focuses on rupture events up to a given 
sequential length. The stability temperature difference 
obtained from the thermal unfolding of CS dimers from 
thermophilic and mesophilic organisms (see Fig. [5]) is of 
the order of 80° C. We note that this value can be affected 
by the partition of the native contacts into local and non¬ 
local (we used the sequential distance cut-off I = 10 in 
our analysis). On the other hand, the stability tempera¬ 
ture difference obtained from the thermal dissociation of 
CS dimers from thermophilic and mesophilic organisms 
is about 60° C (see Fig. IH). We note that this value is 
unaffected by the choice of the sequential distance cut¬ 
off, Z, as we take into account all inter-monomer contacts 
in the analysis of the simulation data. Interestingly, this 
temperature difference, 60° C, is consistent with the dif¬ 
ference in the optimal growth temperatures as given in 
Table H 

We analyzed thermal fluctuates and collective motions 
of the active site in CS enzymes from different organ¬ 
isms, both in the open and closed conformations. We 
find that the three residues that are directly involved in 
the chemical reaction of citrate formation exhibit rather 
small fluctuations in the closed conformation. In the 
open conformation, however, the thermal fluctuations of 
these three residues are significantly enhanced. We also 
find that the collective motions of the active site in the 
open conformation are mainly due to the movements of 
the small domain relative to the large domain. In the 
open conformation, due to the presence of the citrate and 
Co A molecules, the small domain is tightly bound to the 
large domain, and the active site residues have no free¬ 
dom to perform any large-amplitude collective motions. 
Taken together, our results show that the relatively sim¬ 
ple structure-based model correctly captures the similar¬ 
ities and differences in thermodynamic and kinetic prop¬ 
erties of the different forms of the CS enzyme. 
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